Current Issue: July-September, Volume: 2024, Issue Number: 3, Articles: 5
Autonomous object manipulation is a challenging task in robotics because it requires an understanding of object parameters such as position, 3D shape, grasping (i.e., touching) areas, and orientation. This work presents an autonomous object manipulation system using an anthropomorphic soft robot hand with deep learning (DL) vision intelligence for object detection, 3D shape reconstruction, and object grasping area generation. Object detection is performed using Faster-RCNN and an RGB-D sensor to produce a partial depth view of the objects randomly located in the working space. Three-dimensional object shape reconstruction is performed using a U-Net based on 3D convolutions with bottleneck layers and skip connections, generating a complete 3D shape of the object from the sensed single depth view. Then, the grasping position and orientation are computed from the reconstructed 3D object information (e.g., object shape and size) using a U-Net based on 3D convolutions and Principal Component Analysis (PCA), respectively. The proposed system is evaluated by grasping and relocating twelve objects not included in the training database, achieving average success rates of 95% for object grasping and 93% for object relocation....
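As a rough illustration of how PCA can yield an orientation from 3D shape data, the sketch below estimates the dominant axis of a point set. It assumes the reconstructed shape is available as a list of (x, y, z) points, which the abstract does not state. The function name `principal_axis` is hypothetical, and power iteration is used here only to keep the example dependency-free; it is not necessarily what the authors used.

```python
import math

def principal_axis(points, iterations=100):
    """Return a unit vector along the direction of largest variance
    for a list of (x, y, z) points (first PCA component)."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in points]
    # 3x3 sample covariance matrix of the centered points
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(3)]
           for i in range(3)]
    # Power iteration converges to the eigenvector with the largest eigenvalue
    v = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical elongated object: long along x, thin along y and z
cloud = [(float(t), 0.1 * ((t % 3) - 1), 0.05 * ((t % 2) - 0.5))
         for t in range(-10, 11)]
axis = principal_axis(cloud)
```

For this made-up point set the returned axis lies almost entirely along x, which is the direction a gripper would need to align with or across.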
Speech emotion recognition (SER) is a challenging task due to the complex and subtle nature of emotions. This study proposes a novel approach for emotion modeling using speech signals by combining discrete wavelet transform (DWT) with linear prediction coding (LPC). The performance of various classifiers, including support vector machine (SVM), K-Nearest Neighbors (KNN), Efficient Logistic Regression, Naive Bayes, Ensemble, and Neural Network, was evaluated for emotion classification using the EMO-DB dataset. Evaluation metrics such as area under the curve (AUC) and average prediction accuracy were employed, together with cross-validation. The results indicate that KNN and SVM classifiers exhibited high accuracy in distinguishing sadness from other emotions. Ensemble methods and Neural Networks also demonstrated strong performance in sadness classification. Efficient Logistic Regression and Naive Bayes were competitive but slightly less accurate than the other classifiers, with Efficient Logistic Regression exhibiting the lowest accuracies. Furthermore, the proposed feature extraction method yielded the highest average accuracy, and its combination with formants or wavelet entropy further improved classification accuracy. The distinctive contribution of this study is that it investigated a combined feature extraction method and compared it against various other feature combinations, with the aim of improving classifier performance and overall system effectiveness for emotion classification tasks. These findings can guide the selection of appropriate classifiers and feature extraction methods in future research and real-world applications. Further investigations can focus on refining classifiers and exploring additional feature extraction techniques to enhance emotion classification accuracy....
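To make the idea of combining DWT with LPC concrete, here is a minimal sketch that splits a signal into two subbands and computes LPC coefficients on each. Everything specific is an assumption: the Haar wavelet, the single decomposition level, the LPC order, and the function names (`haar_dwt`, `lpc`, `dwt_lpc_features`) are illustrative choices, not the configuration used in the study.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients.
    A trailing odd sample, if any, is dropped."""
    s = 1 / math.sqrt(2)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) * s)
        detail.append((a - b) * s)
    return approx, detail

def lpc(signal, order):
    """LPC coefficients a[1..order] via the autocorrelation method and the
    Levinson-Durbin recursion, for the predictor x[n] ~ sum_k a[k] * x[n-k]."""
    n = len(signal)
    r = [sum(signal[i] * signal[i + k] for i in range(n - k))
         for k in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # signal is perfectly predicted (or all zeros)
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1 - k * k
    return a[1:]

def dwt_lpc_features(signal, order=2):
    """Concatenate LPC coefficients of the approximation and detail subbands."""
    approx, detail = haar_dwt(signal)
    return lpc(approx, order) + lpc(detail, order)

# Hypothetical frame standing in for a short speech segment
frame = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(64)]
features = dwt_lpc_features(frame, order=2)
```

The resulting feature vector (here four numbers per frame) is the kind of input that could then be passed to any of the classifiers listed in the abstract.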
Estimation of in vivo muscle forces during human motion is important for understanding human motion control mechanisms and joint mechanics. This paper combined the advantages of the convolutional neural network (CNN) and long short-term memory (LSTM) and proposed a novel muscle force estimation method based on CNN–LSTM. A wearable sensor system was also developed to collect the angles and angular velocities of the hip, knee, and ankle joints in the sagittal plane during walking, and the collected kinematic data were used as the input for the neural network model. The muscle forces calculated using OpenSim based on the Static Optimization (SO) method were used as the reference values to train the neural network model. Four lower limb muscles of the left leg, namely gluteus maximus (GM), rectus femoris (RF), gastrocnemius (GAST), and soleus (SOL), were selected for study. The experimental results showed that, compared to the standard CNN and the standard LSTM, the CNN–LSTM performed better in muscle force estimation under slow (1.2 m/s), medium (1.5 m/s), and fast (1.8 m/s) walking speeds. The average correlation coefficients between true and estimated values of the four muscle forces under slow, medium, and fast walking speeds were 0.9801, 0.9829, and 0.9809, respectively. The average correlation coefficients fluctuated only slightly across walking speeds, which indicated that the model had good robustness. The external testing experiment showed that the CNN–LSTM also generalized well: the model performed well when the test subject was not included in the training sample. This article proposed a convenient method for estimating muscle forces, which could provide theoretical assistance for the quantitative analysis of human motion and muscle injury. The method establishes the relationship between joint kinematic signals and muscle forces during walking using a neural network model; compared with calculating muscle forces using the SO method in OpenSim, it is more convenient and efficient for clinical analysis and engineering applications....
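The correlation coefficients quoted above compare reference and estimated force curves. The abstract does not say which coefficient was used; the sketch below assumes the Pearson coefficient, and the function name `pearson_r` and the sample force values are hypothetical.

```python
import math

def pearson_r(reference, estimate):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimate) / n
    cov = sum((r - mean_ref) * (e - mean_est)
              for r, e in zip(reference, estimate))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_est = sum((e - mean_est) ** 2 for e in estimate)
    return cov / math.sqrt(var_ref * var_est)

# Made-up force samples (N) over part of a gait cycle, for illustration only
so_forces = [120.0, 180.0, 260.0, 310.0, 240.0, 150.0]
nn_forces = [125.0, 176.0, 255.0, 318.0, 236.0, 155.0]
r = pearson_r(so_forces, nn_forces)
```

A value close to 1 means the estimated curve tracks the shape of the reference curve; it does not by itself show that the absolute force magnitudes agree.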
The spread of high-performance personal computers, frequently equipped with powerful Graphics Processing Units (GPUs), has raised interest in a set of techniques that are able to extract models of electromagnetic phenomena (and devices) directly from available examples of desired behavior. Such approaches are collectively referred to as Machine Learning (ML). A typical representative ML approach is the so-called “Neural Network” (NN). Using such data-driven models allows the output to be evaluated in a much shorter time when a theoretical model is available, or allows the behavior of systems and devices to be predicted when no theoretical model is available. With reference to a simple yet representative benchmark electromagnetic problem, some of the possibilities and pitfalls of using NNs to interpret measurements (inverse problem) or to obtain required measurements (optimal design problem) are discussed. The investigated aspects include the choice of NN model, the generation of the dataset(s), and the selection of hyper-parameters (hidden layers, training paradigm). Finally, the capabilities in the handling of ill-posed problems are critically reviewed....
Smart cities represent a new frontier of urban living, with advanced technology being used to enhance the quality of life for residents. Many of these cities have developed transportation systems that improve efficiency, sustainability, and service quality. Integrating cutting-edge transportation technology and data-driven solutions improves safety, reduces environmental impact, optimizes traffic flow during peak hours, and reduces congestion. Intelligent transportation systems comprise many subsystems, one of which is traffic sign detection. Such a system relies on advanced techniques such as machine learning and computer vision, and is trained to recognize and interpret a variety of traffic signs, such as yield signs, stop signs, speed limits, and pedestrian crossings. Ensuring accurate and robust traffic sign recognition is paramount for the safe deployment of self-driving cars in diverse and challenging environments like the Arab world. However, existing methods face many challenges, including variability in the appearance of signs, real-time processing requirements, occlusions that can block signs, and low-quality images. This paper introduces an advanced Lightweight and Efficient Convolutional Neural Network (LE-CNN) architecture specifically designed for accurate and real-time Arabic traffic sign classification. The proposed LE-CNN architecture leverages depth-wise separable convolutions and channel pruning to achieve significant improvements in both speed and accuracy compared to existing models. An extensive evaluation of the LE-CNN on the Arabic traffic sign dataset demonstrates an accuracy of 96.5% with a low inference time of 1.65 s, which is crucial for real-time applications in self-driving cars. It achieves high accuracy with low false positive and false negative rates, demonstrating its potential for real-world applications like autonomous driving and advanced driver-assistance systems....
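One reason depth-wise separable convolutions suit lightweight models is their lower parameter count. The sketch below compares the weights in a standard convolution with those in a depth-wise plus point-wise pair. The layer sizes and function names are hypothetical, biases are ignored, and nothing here reflects the actual LE-CNN layer configuration.

```python
def standard_conv_params(kernel, in_channels, out_channels):
    """Weights in a standard k x k convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels

def separable_conv_params(kernel, in_channels, out_channels):
    """Weights in a depth-wise k x k convolution followed by a 1 x 1
    point-wise convolution (biases ignored)."""
    depthwise = kernel * kernel * in_channels
    pointwise = in_channels * out_channels
    return depthwise + pointwise

# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels
standard = standard_conv_params(3, 32, 64)    # 18432
separable = separable_conv_params(3, 32, 64)  # 2336
reduction = standard / separable
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of saving that helps reduce inference cost.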